Owing to its ability to exploit intrinsic supervision information, contrastive learning has recently achieved promising performance in deep graph clustering. However, we observe that two drawbacks of the positive and negative sample construction mechanisms prevent existing algorithms from improving further. 1) The quality of positive samples depends heavily on carefully designed data augmentations, and inappropriate augmentations easily lead to semantic drift and indiscriminative positive samples. 2) The constructed negative samples are unreliable because they ignore important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) that mines the intrinsic supervision information in high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in the two views. Moreover, to construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function that pulls together samples from the same cluster and pushes away those from other clusters by maximizing the cross-view cosine similarity between positive samples and minimizing it between negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with the existing state-of-the-art algorithms.
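The cluster-guided objective described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the squared-error form of the loss, the function names, and the plain-list data layout are all assumptions made for illustration. The abstract only states that cross-view cosine similarity is maximized for positives and minimized for negatives.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def mean_vec(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cluster_guided_loss(view1, view2, labels, confident):
    """Hypothetical loss: (1 - cos)^2 on positives, cos^2 on negative centers.

    Assumes at least two clusters contain high-confidence samples.
    """
    idx = [i for i in range(len(labels)) if confident[i]]
    # Positives: high-confidence samples of the same cluster, across views.
    pos = [(1.0 - cosine(view1[i], view2[j])) ** 2
           for i in idx for j in idx if labels[i] == labels[j]]
    # Negatives: centers of different high-confidence clusters, across views.
    clusters = sorted({labels[i] for i in idx})
    c1 = {k: mean_vec([view1[i] for i in idx if labels[i] == k]) for k in clusters}
    c2 = {k: mean_vec([view2[i] for i in idx if labels[i] == k]) for k in clusters}
    neg = [cosine(c1[a], c2[b]) ** 2
           for a in clusters for b in clusters if a != b]
    return sum(pos) / len(pos) + sum(neg) / len(neg)
```

With well-separated embeddings the loss is small when the pseudo-labels match the geometry and grows when samples from different groups are assigned to the same cluster.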
Contrastive deep graph clustering, which aims to divide nodes into disjoint groups via contrastive mechanisms, is a challenging research topic. Among recent works, hard sample mining-based algorithms have attracted great attention for their promising performance. However, we find that the existing hard sample mining methods have two problems. 1) In the hardness measurement, important structural information is overlooked in the similarity calculation, degrading the representativeness of the selected hard negative samples. 2) Previous works focus merely on the hard negative sample pairs while neglecting the hard positive sample pairs, even though samples within the same cluster but with low similarity should also be carefully learned. To solve these problems, we propose a novel contrastive deep graph clustering method dubbed Hard Sample Aware Network (HSAN) by introducing a comprehensive similarity measure criterion and a general dynamic sample weighting strategy. Concretely, in our algorithm, the similarities between samples are calculated by considering both the attribute embeddings and the structure embeddings, which better reveals sample relationships and assists hardness measurement. Moreover, under the guidance of the carefully collected high-confidence clustering information, our proposed weight modulating function first recognizes the positive and negative samples and then dynamically up-weights the hard sample pairs while down-weighting the easy ones. In this way, our method can mine not only the hard negative samples but also the hard positive samples, thus further improving the discriminative capability of the samples. Extensive experiments and analyses demonstrate the superiority and effectiveness of our proposed method.
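The weighting idea in this abstract can be sketched as follows. This is a minimal illustration rather than the published formulation: the linear blend of attribute and structure similarity, the power-law weight, the parameter names `alpha` and `beta`, and the assumption that similarities are normalized to [0, 1] are all hypothetical choices.

```python
def combined_similarity(attr_sim, struct_sim, alpha=0.5):
    """Blend attribute and structure similarity (both assumed in [0, 1])."""
    return alpha * attr_sim + (1.0 - alpha) * struct_sim


def pair_weight(similarity, same_cluster, confident, beta=2.0):
    """Hypothetical weight modulating function for one sample pair.

    For a high-confidence pair, the pseudo-label target is 1 (same cluster)
    or 0 (different clusters). The further the similarity is from that
    target, the harder the pair and the larger its weight.
    Pairs without high-confidence pseudo-labels keep a neutral weight of 1.
    """
    if not confident:
        return 1.0
    target = 1.0 if same_cluster else 0.0
    return abs(target - similarity) ** beta
```

Under these assumptions a same-cluster pair with similarity 0.1 (a hard positive) receives a much larger weight than one with similarity 0.9 (an easy positive), and the opposite holds for pairs from different clusters.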
Knowledge graph reasoning (KGR), which aims to deduce new facts from existing facts based on logic rules mined from knowledge graphs (KGs), has become a fast-growing research direction. It has been shown to significantly benefit the use of KGs in many AI applications, such as question answering and recommendation systems. According to the graph types, the existing KGR models can be roughly divided into three categories, i.e., static models, temporal models, and multi-modal models. The early works in this domain mainly focus on static KGR and tend to directly apply general knowledge graph embedding models to the reasoning task. However, these models are not suitable for more complex but practical tasks, such as inductive static KGR, temporal KGR, and multi-modal KGR. To this end, multiple works have been developed recently, but no survey papers or open-source repositories comprehensively summarize and discuss models in this important direction. To fill the gap, we conduct a survey of knowledge graph reasoning that traces the field from static to temporal and then to multi-modal KGs. Concretely, the preliminaries, summaries of KGR models, and typical datasets are introduced and discussed in turn. Moreover, we discuss the challenges and potential opportunities. The corresponding open-source repository is shared on GitHub: https://github.com/LIANGKE23/Awesome-Knowledge-Graph-Reasoning.
Graph contrastive learning is an important method for deep graph clustering. The existing methods first generate the graph views with stochastic augmentations and then train the network with a cross-view consistency principle. Although good performance has been achieved, we observe that the existing augmentation methods are usually random and rely on pre-defined augmentations, which are insufficient and lack interaction with the final clustering task. To solve the problem, we propose a novel Graph Contrastive Clustering method with Learnable graph Data Augmentation (GCC-LDA), in which the augmentation is optimized entirely by neural networks. An adversarial learning mechanism is designed to keep cross-view consistency in the latent space while ensuring the diversity of the augmented views. In our framework, a structure augmentor and an attribute augmentor are constructed for augmentation learning at both the structure level and the attribute level. To improve the reliability of the learned affinity matrix, clustering is introduced into the learning procedure, and the learned affinity matrix is refined with both the high-confidence pseudo-label matrix and the cross-view sample similarity matrix. During training, to provide persistent optimization for the learned view, we design a two-stage training strategy to obtain more reliable clustering information. Extensive experimental results demonstrate the effectiveness of GCC-LDA on six benchmark datasets.
Knowledge graph embedding (KGE) aims to learn powerful representations to benefit various artificial intelligence applications, such as question answering and recommendation. Meanwhile, contrastive learning (CL), an effective mechanism for enhancing the discriminative capacity of learned representations, has been leveraged in different fields, especially in graph-based models. However, since the structures of knowledge graphs (KGs) are usually more complicated than those of homogeneous graphs, it is hard to construct appropriate contrastive sample pairs. In this paper, we find that the entities within a symmetrical structure are usually more similar and correlated. This key property can be utilized to construct positive pairs for contrastive learning. Following these ideas, we propose a relational symmetrical structure based knowledge graph contrastive learning framework, termed KGE-SymCL, which leverages the symmetrical structure information in KGs to enhance the discriminative ability of KGE models. Concretely, a plug-and-play approach is designed by taking the entities in relational symmetrical positions as the positive samples. In addition, a self-supervised alignment loss is used to pull together the constructed positive sample pairs. Extensive experimental results on benchmark datasets verify the good generalization and superiority of the proposed framework.
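In recent years, graph neural networks (GNNs) have achieved promising performance in semi-supervised node classification. However, insufficient supervision, together with representation collapse, largely limits the performance of GNNs in this field. To alleviate the collapse of node representations in the semi-supervised scenario, we propose a novel graph contrastive learning method termed Mixed Graph Contrastive Network (MGCN). In our method, we improve the discriminative capability of the latent features by enlarging the margin of the decision boundaries and improving the cross-view consistency of the latent representations. Specifically, we first adopt an interpolation-based strategy to conduct data augmentation in the latent space and then force the prediction model to change linearly between samples. Second, we enable the learned network to tell apart samples across two interpolation-perturbed views by forcing the cross-view correlation matrix to approximate an identity matrix. By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discriminative representation learning. Extensive experimental results on six datasets demonstrate the effectiveness and generality of MGCN compared with the existing state-of-the-art methods.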
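The interpolation step described in this abstract can be illustrated with a short sketch. The function names, the mixing coefficient `lam`, and the squared-error penalty are assumptions for illustration and are not taken from the paper. The sketch mixes two latent vectors and measures how far a predictor deviates from changing linearly between them.

```python
def interpolate(u, v, lam):
    """Convex combination lam * u + (1 - lam) * v of two equal-length vectors."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(u, v)]


def linearity_penalty(predict, h_i, h_j, lam):
    """Squared gap between predict(mix(h_i, h_j)) and mix(predict(h_i), predict(h_j)).

    `predict` is any callable mapping a latent vector to an output vector.
    The penalty is zero when the predictor changes linearly between the samples.
    """
    mixed_input = predict(interpolate(h_i, h_j, lam))
    mixed_output = interpolate(predict(h_i), predict(h_j), lam)
    return sum((a - b) ** 2 for a, b in zip(mixed_input, mixed_output))
```

For a linear predictor such as `lambda h: [2 * h[0] + h[1]]` the penalty is zero for any `lam`, while a nonlinear predictor such as `lambda h: [h[0] ** 2]` is penalized.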
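Contrastive learning has recently attracted much attention in deep graph clustering for its promising performance. However, complicated data augmentations and time-consuming graph convolution operations undermine the efficiency of these methods. To solve this problem, we propose a Simple Contrastive Graph Clustering (SCGC) algorithm that improves the existing methods from the perspectives of network architecture, data augmentation, and objective function. As to the architecture, our network includes two main parts, i.e., pre-processing and the network backbone. A simple low-pass denoising operation performs neighbor information aggregation as an independent pre-processing step, and only two multilayer perceptrons (MLPs) are included as the backbone. For data augmentation, instead of introducing complex operations over graphs, we construct two augmented views of the same vertex by designing parameter-unshared Siamese encoders and directly corrupting the node embeddings. Finally, as to the objective function, a novel cross-view structural consistency objective function is designed to enhance the discriminative capability of the learned network and further improve the clustering performance. Extensive experimental results on seven benchmark datasets validate the effectiveness and superiority of our proposed algorithm. Notably, our algorithm outperforms recent contrastive deep clustering competitors with at least a seven-fold speedup on average.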
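The low-pass pre-processing step mentioned in this abstract can be sketched in plain Python. The specific filter used here, repeated multiplication by the symmetrically normalized adjacency matrix with self-loops, is a common choice and is assumed for illustration. The abstract itself does not specify the filter, and the function name and `steps` parameter are hypothetical.

```python
import math


def low_pass_filter(adj, features, steps=2):
    """Smooth node features by applying D^{-1/2} (A + I) D^{-1/2} `steps` times.

    adj      -- dense adjacency matrix as a list of lists (n x n)
    features -- node feature matrix as a list of lists (n x d)
    """
    n = len(adj)
    dim = len(features[0])
    # Add self-loops so every node keeps part of its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization of the self-looped adjacency matrix.
    filt = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    smoothed = [row[:] for row in features]
    for _ in range(steps):
        smoothed = [[sum(filt[i][k] * smoothed[k][d] for k in range(n))
                     for d in range(dim)]
                    for i in range(n)]
    return smoothed
```

Because the filter has no trainable parameters, it can be run once before training, after which a plain MLP backbone operates on the smoothed features.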
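Semi-supervised learning (SSL) has long been proven to be an effective technique for building powerful models with limited labels. In the existing literature, consistency regularization-based methods, which force perturbed samples to have predictions similar to those of the original samples, have attracted much attention. However, we observe that the performance of such methods decreases drastically when labels become extremely limited, e.g., 2 or 3 labels per category. Our empirical study finds that the main problem lies in the drift of semantic information during data augmentation. The problem can be alleviated when enough supervision is provided. However, when little guidance is available, incorrect regularization misleads the network and undermines the performance of the algorithm. To tackle the problem, we (1) propose an interpolation-based method to construct more reliable positive sample pairs; and (2) design a novel contrastive loss that guides the embeddings of the learned network to change linearly between samples, improving the discriminative capability of the network by enlarging the margin of the decision boundaries. Since no destructive regularization is introduced, the performance of our proposed algorithm is largely improved. Specifically, the proposed algorithm outperforms the second-best algorithm (COMATT) by 5.3%, achieving 88.73% classification accuracy when only two labels are available for each class on the CIFAR-10 dataset. Moreover, we further demonstrate the generality of the proposed method by considerably improving the performance of existing state-of-the-art algorithms with our proposed strategy.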
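Deep graph clustering, which aims to reveal the underlying graph structure and divide the nodes into different groups, has attracted intensive attention in recent years. However, we observe that, in the process of node encoding, existing methods suffer from representation collapse, which tends to map all data into the same representation. Consequently, the discriminative capability of the node representations is limited, leading to unsatisfactory clustering performance. To address this issue, we propose a novel self-supervised deep graph clustering method termed Dual Correlation Reduction Network (DCRN), which reduces information correlation in a dual manner. Specifically, in our method, we first design a Siamese network to encode samples. Then, by forcing the cross-view sample correlation matrix and the cross-view feature correlation matrix to approximate two identity matrices, respectively, we reduce the information correlation at both levels, thus improving the discriminative capability of the resulting features. Moreover, to alleviate the representation collapse caused by over-smoothing in GCN, we introduce a propagation regularization term that enables the network to capture long-distance information with a shallow network structure. Extensive experimental results on six benchmark datasets demonstrate the effectiveness of the proposed DCRN against the existing state-of-the-art methods.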
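The correlation reduction idea in this abstract can be sketched as follows. This is an illustrative approximation, not the authors' code: the use of cosine similarity as the correlation measure, the mean squared error against the identity matrix, and the function names are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def transpose(matrix):
    """Swap rows and columns so that features become rows."""
    return [list(col) for col in zip(*matrix)]


def correlation_reduction_loss(view1, view2):
    """Mean squared error between the cross-view similarity matrix and identity.

    Rows of view1 and view2 are the items being correlated: pass the
    embeddings directly for the sample level, or their transposes for the
    feature level.
    """
    n = len(view1)
    total = 0.0
    for i in range(n):
        for j in range(n):
            target = 1.0 if i == j else 0.0
            total += (cosine(view1[i], view2[j]) - target) ** 2
    return total / (n * n)
```

A feature-level term under the same assumptions would be `correlation_reduction_loss(transpose(z1), transpose(z2))`. Collapsed embeddings, where every row is identical, yield a large loss because all off-diagonal similarities equal one.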
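Graph representation learning (GRL) on attribute-missing graphs, a common yet challenging problem, has recently attracted considerable attention. We observe that the existing literature 1) isolates the learning of attribute and structure embeddings and thus fails to take full advantage of the two types of information; and 2) imposes overly strict distribution assumptions on the latent space variables, leading to less discriminative feature representations. In this paper, based on the idea of introducing intimate information interaction between the two information sources, we propose the Siamese Attribute-missing Graph Auto-encoder (SAGA). Specifically, three strategies are adopted. First, we entangle the attribute embedding and the structure embedding by introducing a Siamese network structure that shares the parameters learned by both processes, which allows network training to benefit from richer and more diverse information. Second, we introduce a K-nearest neighbor (KNN) and structural constraint enhanced learning mechanism to improve the quality of the latent features of the missing attributes by filtering out unreliable connections. Third, we manually mask connections on multiple adjacency matrices and force the structural information embedding sub-network to recover the true adjacency matrix, thus compelling the resulting network to selectively exploit higher-order discriminative features for data completion. Extensive experiments on six benchmark datasets demonstrate the superiority of SAGA over the state-of-the-art methods.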